Approximate Value Iteration with Temporally Extended Actions (Extended Abstract)

Authors

Timothy Arthur Mann

Shie Mannor

Doina Precup

Abstract

The options framework provides a concrete way to implement and reason about temporally extended actions. Existing literature has demonstrated the value of planning with options empirically, but there is a lack of theoretical analysis formalizing when planning with options is more efficient than planning with primitive actions. We provide a general analysis of the convergence rate of a popular Approximate Value Iteration (AVI) algorithm called Fitted Value Iteration (FVI) with options. Our analysis reveals that longer duration options and a pessimistic estimate of the value function both lead to faster convergence. Furthermore, options can improve convergence even when they are suboptimal and sparsely distributed throughout the state space. Next, we consider generating useful options for planning based on a subset of landmark states. This suggests a new algorithm, Landmark-based AVI (LAVI), that represents the value function only at landmark states. We analyze OFVI and LAVI using the proposed landmark-based options and compare the two algorithms. Our theoretical and experimental results demonstrate that options can play an important role in AVI by decreasing approximation error and inducing fast convergence.
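The claim that longer options and a pessimistic initial value estimate speed up convergence can be illustrated with a toy example. The sketch below is not the FVI or LAVI algorithm from the paper: it is exact, tabular value iteration on a hypothetical deterministic chain in which the only primitive action is "step right" and entering the last state yields reward 1. The chain, the reward, and the names `option_model` and `iterations_to_converge` are all assumptions made for illustration.

```python
def option_model(s, d, n_states, gamma):
    """Model of the option 'step right for up to d steps' started in state s.

    Returns (discounted reward collected, total discount, final state).
    A duration of d = 1 is the primitive action.
    """
    goal = n_states - 1
    reward, discount = 0.0, 1.0
    for _ in range(d):
        if s == goal:          # option terminates at the goal
            break
        s += 1
        if s == goal:          # reward 1 for entering the goal
            reward += discount
        discount *= gamma
    return reward, discount, s


def iterations_to_converge(durations, n_states=21, gamma=0.95, tol=1e-9):
    """Count value iteration sweeps until the sup-norm error drops below tol."""
    goal = n_states - 1
    # Closed-form optimal values for this chain; the goal is terminal.
    v_star = [gamma ** (goal - s - 1) for s in range(goal)] + [0.0]
    v = [0.0] * n_states       # pessimistic start: 0 <= V*(s) everywhere
    sweeps = 0
    while max(abs(a - b) for a, b in zip(v, v_star)) > tol:
        new_v = [0.0] * n_states
        for s in range(goal):
            # Backup over options: reward + gamma^duration * V(next state)
            new_v[s] = max(
                r + disc * v[s_next]
                for r, disc, s_next in (
                    option_model(s, d, n_states, gamma) for d in durations
                )
            )
        v = new_v
        sweeps += 1
    return sweeps


if __name__ == "__main__":
    print("primitive actions only:", iterations_to_converge([1]))
    print("with a 5-step option:  ", iterations_to_converge([1, 5]))
```

Under these assumptions, correct values propagate backward from the goal one state per sweep with primitive actions alone, and five states per sweep once the 5-step option is available, so the sweep count drops from 20 to 4.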


Similar resources

Scaling Up Approximate Value Iteration with Options: Better Policies with Fewer Iterations

We show how options, a class of control structures encompassing primitive and temporally extended actions, can play a valuable role in planning in MDPs with continuous state-spaces. Analyzing the convergence rate of Approximate Value Iteration with options reveals that for pessimistic initial value function estimates, options can speed up convergence compared to planning with only primitive act...


Approximate Value Iteration with Temporally Extended Actions

Temporally extended actions have proven useful for reinforcement learning, but their duration also makes them valuable for efficient planning. The options framework provides a concrete way to implement and reason about temporally extended actions. Existing literature has demonstrated the value of planning with options empirically, but there is a lack of theoretical analysis formalizing when pla...


Theoretical Results on Reinforcement Learning with Temporally Abstract Options

We present new theoretical results on planning within the framework of temporally abstract reinforcement learning (Precup & Sutton, 1997; Sutton, 1995). Temporal abstraction is a key step in any decision making system that involves planning and prediction. In temporally abstract reinforcement learning, the agent is allowed to choose among "options", whole courses of action that may be temporall...


Theoretical Results on Reinforcement

We present new theoretical results on planning within the framework of temporally abstract reinforcement learning (Precup & Sutton, 1997; Sutton, 1995). Temporal abstraction is a key step in any decision making system that involves planning and prediction. In temporally abstract reinforcement learning, the agent is allowed to choose among "behaviors", whole courses of action that may be tempor...


Solving time-fractional chemical engineering equations by modified variational iteration method as fixed point iteration method

The variational iteration method (VIM) was extended to find approximate solutions of fractional chemical engineering equations. The Lagrange multipliers of the VIM were not identified explicitly. In this paper we improve the VIM by using the concept of the fixed point iteration method. This method was then implemented for solving a system of time-fractional chemical engineering equations. The ob...



Publication date: 2017
